Zhaowei Zhang

Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents

May 28, 2026

Does LLM Alignment Really Need Diversity? An Empirical Study of Adapting RLVR Methods for Moral Reasoning

Mar 11, 2026

EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding

May 26, 2025

Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

Feb 26, 2025

Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Models Alignment

Oct 22, 2024

Efficient Model-agnostic Alignment via Bayesian Persuasion

May 29, 2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024

Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects

Mar 01, 2024

CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents

Jan 19, 2024

AI Alignment: A Comprehensive Survey

Nov 01, 2023